Detecting and Revamping of X-Outliers in Time Series Database
Abstract
Outliers in a dataset degrade the accuracy of subsequent data mining tasks. To improve mining performance, the outliers present in the dataset must be detected and revamped. Existing techniques such as ARMA (Auto-Regressive Moving Average), ARIMA (Auto-Regressive Integrated Moving Average) and the multivariate linear Gaussian state space model do not take periodicity into account when detecting outliers. These methods find only Y-Outliers, that is, abnormal values along the Y axis. They cannot detect the time at which peculiar data occurs (so-called X-Outliers). This paper focuses on methods for detecting and revamping X-Outliers, which contain data that is abnormal with respect to a known periodicity. These methods have practical applications in fraud detection, market-basket analysis and medical applications, where they help detect certain abnormal diseases. First, the data is modeled with kernel smoothing to capture its trend and remove noise. Next, outliers are detected using similarity measurements. If the dataset contains outliers, they are replaced using the periodic indices of the historical dataset. System performance is measured by precision, recall and F-score. The proposed method is tested on three time series datasets, namely an electricity power consumption dataset, a weather dataset and ...
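The abstract names the stages of the method (kernel smoothing, similarity-based detection, replacement from periodic indices, and precision/recall/F-score evaluation) but not their exact formulas. The following is only a minimal sketch of how such a pipeline could look, under stated assumptions: Gaussian-kernel (Nadaraya-Watson) smoothing applied within each period, Euclidean distance to a median period profile as the similarity measure, a simple ratio threshold, and replacement by the per-index median of the unflagged periods. All function names, the `bandwidth` value and the `ratio` threshold are hypothetical and are not taken from the paper.

```python
import numpy as np

def smooth_periods(segments, bandwidth=2.0):
    """Gaussian-kernel (Nadaraya-Watson) smoothing of each row (one period)."""
    t = np.arange(segments.shape[1])
    # weights[i, j]: influence of sample j on the smoothed value at index i
    weights = np.exp(-0.5 * ((t[:, None] - t[None, :]) / bandwidth) ** 2)
    return segments @ weights.T / weights.sum(axis=1)

def split_periods(y, period):
    """Reshape a series into (n_periods, period), dropping an incomplete tail."""
    y = np.asarray(y, dtype=float)
    n_periods = len(y) // period
    return y[: n_periods * period].reshape(n_periods, period)

def detect_x_outliers(y, period, ratio=3.0):
    """Flag periods whose smoothed shape is far from the typical period.

    Assumed similarity measure: Euclidean distance to the median profile.
    Assumed rule: flag when the distance exceeds `ratio` times the median distance.
    """
    smoothed = smooth_periods(split_periods(y, period))
    typical = np.median(smoothed, axis=0)
    dist = np.linalg.norm(smoothed - typical, axis=1)
    return np.flatnonzero(dist > ratio * np.median(dist))

def revamp(y, period, flagged):
    """Replace flagged periods with the per-index median of the other periods."""
    segments = split_periods(y, period).copy()
    keep = np.setdiff1d(np.arange(len(segments)), flagged)
    segments[flagged] = np.median(segments[keep], axis=0)
    return segments.ravel()

def precision_recall_f(predicted, actual):
    """Precision, recall and F-score over sets of flagged period indices."""
    predicted = {int(i) for i in predicted}
    actual = {int(i) for i in actual}
    tp = len(predicted & actual)
    precision = tp / len(predicted) if predicted else 0.0
    recall = tp / len(actual) if actual else 0.0
    total = precision + recall
    f_score = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_score

# Synthetic usage example: ten daily cycles, one with its peak shifted in time.
rng = np.random.default_rng(0)
base = 10 * np.sin(2 * np.pi * np.arange(24) / 24)
series = np.tile(base, 10) + rng.normal(0, 0.3, 240)
series[4 * 24 : 5 * 24] = np.roll(base, 6) + rng.normal(0, 0.3, 24)

flagged = detect_x_outliers(series, period=24)
repaired = revamp(series, period=24, flagged=flagged)
print(flagged, precision_recall_f(flagged, [4]))
```

In this sketch the shifted cycle contains no unusual values on the Y axis; only the time at which its peak occurs is abnormal, which is why the comparison is made per periodic index rather than per value. Smoothing each period separately keeps an abnormal period from bleeding into its neighbours, at the cost of some bias at the period boundaries.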
Similar resources
Identification of outliers types in multivariate time series using genetic algorithm
Multivariate time series data are often modeled using the vector autoregressive moving average (VARMA) model. However, the presence of outliers can violate the stationarity assumption and may lead to incorrect modeling, biased parameter estimates and inaccurate predictions. Thus, detection of these points and how to deal properly with them, especially in relation to modeling and parameter estimation of VARMA m...
New optimized model identification in time series model and its difficulties
Model identification is an important and complicated step within the autoregressive integrated moving average (ARIMA) methodology framework. This step is especially difficult for integrated series. In this article, we first investigate the Box-Jenkins methodology and its shortcomings in model detection, and then discuss the problem of outliers in time series. By using this optimization method, we wil...
Detecting Outliers in Exponentiated Pareto Distribution
In this paper, we use two statistics for detecting outliers in the exponentiated Pareto distribution. These statistics extend the statistics for detecting outliers in exponential and gamma distributions. We compare the power of our test statistics based on a simulation study and identify the better test statistic for detecting outliers in the exponentiated Pareto distribution. At t...
A Bayesian Approach for Detecting Outliers in ARMA Time Series
The presence of outliers in a time series can seriously affect model specification and parameter estimation. To avoid these adverse effects, it is essential to detect these outliers and remove them from the time series. Based on Bayesian statistical theory, this article proposes a method for simultaneously detecting the additive outlier (AO) and innovative outlier (IO) in an autoregressive moving-a...
Outlier Detection in Multivariate Time Series via Projection Pursuit
This article uses Projection Pursuit methods to develop a procedure for detecting outliers in a multivariate time series. We show that testing for outliers in some projection directions could be more powerful than testing the multivariate series directly. The optimal directions for detecting outliers are found by numerical optimization of the kurtosis coefficient of the projected series. We pro...